Convex Optimization 2015.09.16
Convex Optimization 2015.09.16
\(\min f_0(x)\)
\(s.t. f_i(x) \le b_i, i = 1 ... m\)
Here \(x \in \mathbb{R}^n, f_i : \mathbb{R}^n \rightarrow R, i = 0 ... m\) are convex function.
I.e. \(f_i(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y)\) for all \(\theta \in [0, 1]\) and all \(x, y \in dom f_i\)
where \(dom f_i \subseteq \mathbb{R}^n\) is a convex set
where \(C \subseteq \mathbb{R}^n\) is convex if and only if \(\theta x + (1 - \theta) y \in C\) for all \(x, y \in C\), for all \(\theta \in [0, 1]\)
- Fact: Set of \(x\)s \(s.t. f_i(x) \le b\) is a convex set. 
- Proof: Take \(x, y, s.t. f_i(x) \le b_i, f_i(y) \le b_i\) and \(\theta \in [0, 1]\), - Then \(f_i(\theta x + (1 - \theta) y) \le \theta f_i(x) + (1 - \theta) f_i(y) \le \theta b_i + (1 - \theta) b_i = b_i\) 
- Fact: The function \(f: \mathbb{R} \rightarrow \mathbb{R}, f(x) = x^2\) is convex. 
- Proof: \((\theta x + (1 - \theta) y)^2 = \theta^2 + x^2 + 2\theta ( 1 - \theta) xy + (1 - \theta)^2 y^2\) - \(=[\theta + \theta^2 - \theta] x^2 + 2\theta (1 - \theta) xy + [(1 - \theta) + (1 - \theta)^2 - (1 - \theta)] y^2\) - \(=\theta x^2 + (1 - \theta) y^2 + \theta(\theta - 1) x^2 + 2 \theta (1 - \theta) xy + (1 - \theta)(- \theta) y^2\) - \(=\theta f(x) + (1 - \theta) f(y) + \theta (\theta - 1) (x - y)^2\) - \(\ge \theta f(x) + (1 - \theta) f(y)\) 
- All linear function is convex function. 
- Fact: Let \(A \in \mathbb{R}^{m \times n}, b \in \mathbb{R}^m, f: \mathbb{R}^m\) be convex, - Then \(g:\mathbb{R}^n\) defined by \(g(x) = f(Ax + b)\) is convex. 
- Proof: Let \(x, y \in dom g, \theta \in [0, 1]\) - Then \(A(\theta x + (1 - \theta) y) + b = \theta (Ax + b) + (1 - \theta)(Ay + b) \in dom f\) - Take \(x, y \in dom g, \theta \in [0, 1]\) - Then \(g(\theta x + (1 - \theta) y) = f(\theta (Ax + b) + (1 - \theta)(Ay + b)) \le \theta f(Ax + b) + (1 - \theta) f(Ay + b) = \theta g(x) + (1 - \theta) g(y)\) 
- Fact: The sum of two convex functions is convex. 
- Note: Proof starts by nothing that if \(f\), \(g\) are convex functions, then \(dom(f + g) = dom f \cap dom g\) is convex, because intersection of convex sets is convex. 
- Example: The function \(f : \mathbb{R}^n \rightarrow \mathbb{R}\) defined by \(f(x) = \lVert Ax - b \rVert _2^2\) is convex. 
- Proof: \(\lVert Ax - b \rVert _2^2 = (a_1^T x - b_1)^2 + ... + (a_m^T x - b_m)^2\) is a sum of convex functions. 
- Example: \(\min \lVert Ax - b \rVert _2^2\) is a convex optimization problem. This kind of problem is known as a least-square problem. 
- Definition: Norms: A function \(f : \mathbb{R}^n \rightarrow \mathbb{R}\) is a norm if and only if - \(f(x) \ge 0\) for all \(x \in \mathbb{R}^n\) non-negetivity - \(f(x) = 0 \iff x = 0\) definiteness - \(f(t \cdot x) = \lvert t \rvert \cdot \lVert x \rVert\) for all \(t \in \mathbb{R}, x \in \mathbb{R}^n\) homogeneity - \(f(x + y) \le f(x) + f(y), \forall x, y \in \mathbb{R}^n\) triangle inequality 
- Example: For \(p \ge 1\), \(\lVert x \rVert _p = (\lvert x_1 \rvert ^p + ... + \lvert x_n \rvert ^p)^{\frac{1}{p}}\) is a norm - \(\lVert x \rVert _\infty = \max_{i = 1 ... n} \lvert x_i \rvert\) - Geometric Picture: The set \(B_1 = {x \in \mathbb{R}^n : \lVert x \rVert \le 1}\) is bounded, convex, centrally symmetric at 0, nonempty interior. - (If \(C \subseteq \mathbb{R}^n\) the interior \(C^0\) of \(C\) is the set \(\{x \in C : \{y : \lVert x - y \rVert _2 \lt r\} \subseteq C\), for some \(r \gt 0\}\)) 
- Definition: We say that sequence \((Z_i)_{i = 1}^{\infty}\) converges to \(z\) under the norm \(\lVert \cdot \rVert\) if and only if for every \(\varepsilon \gt 0\) there exists \(N \in \mathbb{N}, s.t. \lVert z_i - z \rVert \lt \varepsilon\) for all \(i \ge N\) 
- Fact: In \(\mathbb{R}^n\), convergence happens in one norm if and only if it happens in any other norm. - Let \(\lVert \cdot \rVert\) be a norm in \(\mathbb{R}^n\), - Then \(\exists c_1, c_2 \gt 0\), such that \(c_1 \lVert x \rVert _1 \lt \lVert x \rVert \lt c_2 \lVert x \rVert _1\) for all \(x \in \mathbb{R}^n\) 
- Lemma 1: \(\lVert x - y \rVert \ge \lVert x \rVert - \lVert y \rVert\) for any norm Triangle inequality 
- Proof: \(\lVert x \rVert = \lVert y + (x - y) \rVert \le \lVert y \rVert + \lVert x - y \rVert\) 
- Lemma 2: The norm \(\lVert \cdot \rVert\) is continous with respect to \(\lVert \cdot \rVert _1\). - I.e., for any \(x \in \mathbb{R}^n\), for any \(\varepsilon \gt 0\), there exists \(\delta \gt 0\) - \(s.t. \lVert x - y \rVert _1 \lt \delta \implies \lvert \lVert x \rVert - \lVert y \rVert \rvert \lt \varepsilon\) 
- Proof: \(\lvert \lVert x \rVert - \lVert y \rVert \rvert \le \lvert \lVert x - y \rVert \rvert\) - and \(\lVert x - y \rVert = \lVert \sum_{i = 1}^n \vec{e_i} (x_i - y_i) \rVert\) - \(\le \sum_{i = 1}^n \lVert \vec{e_i} (x_i - y_i) \rVert = \sum_{i = 1}^n \lvert x_i - y_i \rvert \lVert \vec{e_i} \rVert\) - \(\le \max_i (\lVert \vec{e_i} \rVert) \cdot \sum_{i = 1}^n \lvert x_i - y_i \rvert = \max_i (\lVert \vec{e_i} \rVert) \cdot \lVert x - y \rVert _1\) - where \(\vec{e_1}, ..., \vec{e_n}\) is the standard basis - so if \(\delta = \frac{\varepsilon}{\max_i (\lVert \vec{e_i} \rVert)}\) - then \(\lvert x - y \rvert _1 \lt \delta \implies \lvert \lVert x \rVert - \lVert y \rVert \rvert \lt \epsilon\) - \(\lVert \cdot \rVert\) attains its minimum and maximum on the set \(\{x : \lVert x \rVert _1 = 1\}\) which is closed and bounded. - Let \(x_\min\) be such that \(\lVert x_\min \rVert \le \lVert x \rVert\) for all \(x \in \{x : \lVert x \rVert _1 \le 1\}\), - let \(x_\max\) be such that \(\lVert x_\max \rVert \ge \lVert x \rVert\) for all \(x \in \{x : \lVert x \rVert _1 \le 1\}\) - Let \(c_1 = \lVert x_\min \rVert \neq 0, c_2 = \lVert x_\max \rVert\) then for arbitrary \(x, x \neq 0\), - \(\lVert x \rVert = \lVert \frac{x}{\lVert x \rVert _1} \cdot \lVert x \rVert _1 \rVert = \lVert x \rVert _1 \cdot \lVert \frac{x}{\lVert x \rVert _1} \rVert \le \lVert x \rVert _1 \cdot \lVert x_\max \rVert = c_2 \cdot \lVert x \rVert _1\) - Similarity, \(\lVert x \rVert \ge c_1 \lVert x \rVert _1\) 
- Definition: A matrix \(A \in \mathbb{R}^{n \times n}\) is positive semidefinite, if \(x^TAx \ge 0, \forall x \in \mathbb{R}\), and \(A\) is symmetric. 
- Fact: Let \(A\) symmetric, \(A = P^TDP\) where \(P\) is orthogonal, \(D\) diagonal, - Then \(A\) is positive semidefinite iff \(D \ge 0\). 
- Proof: Say \(D = diag(\lambda_1, ..., \lambda_n), \lambda_i \lt 0\) - Let \(x = P^T\vec{e_i}\), Then \(x^TAx = (P^T\vec{e_i})^TP^TDPP^T\vec{e_i} = \vec{e_i}^TD\vec{e_i} = \lambda_i \lt 0\) - If \(D \ge 0\), then \(x^TAx = (x^TP^T)D(Px) \ge 0\) 
 
        
             
                         
                        